A linguistic data acquisition front-end for language recognition evaluation

Authors

  • Gang Liu
  • Chi Zhang
  • John H. L. Hansen
Abstract

One of the major challenges for language identification (LID) systems is sparse training data. Manually collecting linguistic data in a controlled studio is usually expensive and impractical. Multilingual broadcast programs (Voice of America, for instance) offer a reasonable alternative source of linguistic data. However, unlike studio-collected data, broadcast programs usually contain much content other than pure linguistic data: foreground or background music, commercials, and noise from everyday life. In this study, a systematic processing approach is proposed to extract linguistic data from broadcast media. Experimental results on NIST LRE 2009 data show that the proposed method provides a 22.2% relative improvement in segmentation accuracy and a 20.5% relative improvement in LID accuracy.

Similar resources

Language development and acquisition in children

Language acquisition is a natural developmental process, unique to Homo sapiens, in which a child acquires his or her mother tongue as a first language. The simplest theory of language development is that children learn language by imitating adult language. A second possibility is that children acquire language through conditioning. Noam Chomsky put forward the innateness hypothesis. Piaget ...

The Role of Sociolinguistics in Second Language Acquisition

Learning a new language also involves learning a broad system of norms for social relations. This study broadly showed how EFL learners’ speech act is conveyed from their native cultures when they are communicating in English and demonstrated that there are some possibilities of cross-cultural misunderstanding when interlocutors are engaged in the speech act of complimenting with native speakers of...

The Effect of English Vowel-Recognition Training on Beginner and Advanced Iranian ESL Learners

This study was an attempt to investigate the effect of vowel-recognition training on beginner and advanced Iranian ESL learners. A total of 36 adult Iranian ESL learners (18 advanced and 18 beginners) who were students of various majors at Memorial University (MUN) were recruited for the study. Advanced participants had the experience of living in Canada for at least three years while beginners...

Textual Enhancement across Linguistic Structures: EFL Learners' Acquisition of English Forms

The benefits of textual input enhancement in the acquisition of linguistic forms have produced mixed results in SLA literature. The present study investigates the effects of textual enhancement on adult foreign language intake of two English linguistic forms-subjunctive mood and inversion structures-to explore the role of the type of linguistic items in input enhancement studies. It also invest...

On the Effects of Linguistic, Verbal, and Visual Mnemonics on Idioms Learning

Finding more effective ways of teaching second language idioms has been a long-standing concern of many teaching practitioners and researchers. This study was an endeavor to explore the effects of three linguistic mnemonic devices (etymological elaboration, keyword method, and translation) on EFL learners’ recognition and recall of English idioms. To achieve the purpose of the study, ninety male...


Publication date: 2012